Disturbance Injection Under Partial Automation: Robust Imitation Learning for Long-Horizon Tasks

Authors

Abstract

Partial Automation (PA) with intelligent support systems has been introduced in industrial machinery and advanced automobiles to reduce the burden of long hours of human operation. Under PA, operators perform manual operations (providing actions) and operations that switch between automatic and manual modes (mode-switching). Since PA reduces the total duration of manual operation, these two operations, providing actions and mode-switching, can be replicated by imitation learning with high sample efficiency. To this end, this letter proposes Disturbance Injection under Partial Automation (DIPA) as a novel imitation learning framework. In DIPA, the mode and actions (in the manual mode) are assumed to be observable in each state and are used to learn both a mode-switching policy and an action policy. The learning is robustified by injecting disturbances into the operator's actions, optimizing the disturbance level to minimize the covariate shift under PA. We experimentally validated the effectiveness of our method on long-horizon tasks in simulations and a real robot environment, and confirmed that it outperformed previous methods and reduced the burden of demonstration.
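As a rough illustration of the mechanism the abstract describes, the sketch below injects Gaussian disturbances into a toy operator's actions during demonstration, fits a behavior-cloning policy, and grid-searches the disturbance level against a crude covariate-shift proxy. The 1-D dynamics, the operator policy, and the proxy are all invented for illustration and are not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def collect_demo(policy_action, noise_scale, horizon=50):
    """Roll out a toy 1-D system under a (noised) action policy.

    The action is corrupted with a Gaussian disturbance before being
    applied, mimicking DIPA/DART-style disturbance injection.
    """
    states, actions = [], []
    s = 0.0
    for _ in range(horizon):
        a = policy_action(s) + rng.normal(0.0, noise_scale)  # injected disturbance
        states.append(s)
        actions.append(a)
        s = s + 0.1 * a  # hypothetical toy dynamics
    return np.array(states), np.array(actions)

def covariate_shift_proxy(demo_states, learner_states):
    """Mean squared distance between demonstration-time and learner-time
    state visits: a crude stand-in for the paper's covariate-shift objective."""
    return float(np.mean((demo_states - learner_states) ** 2))

operator = lambda s: -s + 1.0  # hypothetical operator action policy

best_scale, best_shift = None, np.inf
for scale in (0.0, 0.1, 0.3):
    demo_s, demo_a = collect_demo(operator, scale)
    # Behavior cloning: fit a linear action policy by least squares.
    X = np.stack([demo_s, np.ones_like(demo_s)], axis=1)
    w, b = np.linalg.lstsq(X, demo_a, rcond=None)[0]
    learner_s, _ = collect_demo(lambda s: w * s + b, 0.0)
    shift = covariate_shift_proxy(demo_s, learner_s)
    if shift < best_shift:
        best_scale, best_shift = scale, shift

print(best_scale, best_shift)
```

Note that in this noiseless linear toy the learner fits the operator exactly, so little or no injected noise wins the grid search; the point of the real method is that, under model mismatch and long horizons, a nonzero optimized disturbance level reduces the shift between demonstration and deployment state distributions.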


Similar Articles

DART: Noise Injection for Robust Imitation Learning

One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy. A known problem with this “off-policy” approach is that the robot’s errors compound when drifting away from the supervisor’s demonstrations. On-policy techniques alleviate this by iteratively collecting corrective actions for the current robot policy. However, these techn...
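The compounding-error problem this abstract mentions can be made concrete with a toy bound: if the learner incurs a small bounded error at every step, and each step's error pushes it further from the demonstrated states, the total deviation grows roughly quadratically in the horizon. The numbers below are an illustrative sketch, not an analysis from the paper.

```python
def rollout_error(horizon, step_error):
    """Worst-case drift when the learner makes a bounded per-step error
    and the accumulated drift carries it away from the supervisor's path."""
    drift = 0.0
    total = 0.0
    for _ in range(horizon):
        drift += step_error   # this step's error adds to the state drift
        total += drift        # accumulated deviation from the supervisor
    return total

# Classic O(T^2) compounding: doubling the horizon roughly quadruples
# the total deviation (sum 1..T scales as T*(T+1)/2).
print(rollout_error(10, 0.01), rollout_error(20, 0.01))
```

Noise-injection methods such as DART attack exactly this: by perturbing the supervisor's actions at demonstration time, the training distribution already contains the off-path states the learner will visit.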


Iterative Noise Injection for Scalable Imitation Learning

One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy. A known problem with this “off-policy” approach is that the robot’s errors compound when drifting away from the supervisor’s demonstrations. On-policy techniques alleviate this by iteratively collecting corrective actions for the current robot policy. However, these techn...


Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning

In this paper, we propose to combine imitation and reinforcement learning via the idea of reward shaping using an oracle. We study the effectiveness of the near-optimal cost-to-go oracle on the planning horizon and demonstrate that the cost-to-go oracle shortens the learner’s planning horizon as a function of its accuracy: a globally optimal oracle can shorten the planning horizon to one, leading t...
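The shortened-horizon idea can be illustrated on a hypothetical chain MDP: with an exact cost-to-go oracle V*, the shaped reward r + γ·V*(s') − V*(s) makes a one-step greedy learner act optimally. The MDP below is invented for illustration and is not the paper's experimental setup.

```python
import numpy as np

# A 5-state chain MDP: actions move left (-1) or right (+1),
# with reward 1 for reaching the rightmost (terminal) state.
n, gamma = 5, 0.9

def step(s, a):
    s2 = min(max(s + a, 0), n - 1)
    r = 1.0 if s2 == n - 1 else 0.0
    return s2, r

# Value iteration for the optimal cost-to-go oracle V*.
V = np.zeros(n)
for _ in range(100):
    V = np.array([0.0 if s == n - 1 else
                  max(step(s, a)[1] + gamma * V[step(s, a)[0]] for a in (-1, 1))
                  for s in range(n)])

def greedy(s):
    """Horizon-1 lookahead on the shaped reward r + gamma*V*(s') - V*(s):
    with an exact oracle, this is already the optimal policy."""
    return max((-1, 1), key=lambda a: step(s, a)[1] + gamma * V[step(s, a)[0]] - V[s])

print([greedy(s) for s in range(n - 1)])  # every non-terminal state moves right
```

This is the limiting case the abstract describes; with an imperfect oracle, the required planning horizon grows with the oracle's error.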



Robust and Incremental Robot Learning by Imitation

In recent years, Learning by Imitation (LbI) has been increasingly explored as a way to easily instruct robots to execute complex motion tasks. However, most approaches do not consider the case in which multiple, and sometimes conflicting, demonstrations are given by different teachers. Nevertheless, it seems advisable that the robot does not start as a tabula rasa, but re-uses previous...



Journal

Journal: IEEE Robotics and Automation Letters

Year: 2023

ISSN: 2377-3766

DOI: https://doi.org/10.1109/lra.2023.3260586